{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "dd8b51d9-b28c-44b8-a73e-926c90b018a3",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/prompts/advanced_prompts.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5ec4ea1-78bb-4dfe-b193-ab9ba5a10e4c",
   "metadata": {},
   "source": [
    "# Advanced Prompt Techniques (Variable Mappings, Functions)\n",
    "\n",
    "This notebook demonstrates several advanced prompt techniques. They let you define more customized and expressive prompts, reuse existing ones, and express certain operations in fewer lines of code.\n",
    "\n",
    "\n",
    "We show the following features:\n",
    "1. Partial formatting\n",
    "2. Prompt template variable mappings\n",
    "3. Prompt function mappings\n",
    "4. Dynamic few-shot examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0a50028b",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-llms-openai llama-index-embeddings-openai"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f435c4df-3682-4a8a-872b-a79ace3695ee",
   "metadata": {},
   "source": [
    "## 1. Partial Formatting\n",
    "\n",
    "Partial formatting (`partial_format`) lets you fill in some of a prompt's variables while leaving the others to be filled in later.\n",
    "\n",
    "This is a convenient way to avoid carrying every required prompt variable all the way down to `format`: you can partially format the prompt as values become available.\n",
    "\n",
    "`partial_format` returns a copy of the prompt template."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a19eaa7f-1e72-498f-8e29-fec9b1ef9ceb",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.prompts import RichPromptTemplate\n",
    "\n",
    "qa_prompt_tmpl_str = \"\"\"\\\n",
    "Context information is below.\n",
    "---------------------\n",
    "{{ context_str }}\n",
    "---------------------\n",
    "Given the context information and not prior knowledge, answer the query.\n",
    "Please write the answer in the style of {{ tone_name }}\n",
    "Query: {{ query_str }}\n",
    "Answer: \\\n",
    "\"\"\"\n",
    "\n",
    "prompt_tmpl = RichPromptTemplate(qa_prompt_tmpl_str)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "933db073-712d-4feb-b49f-6c64a20ec2fa",
   "metadata": {},
   "outputs": [],
   "source": [
    "partial_prompt_tmpl = prompt_tmpl.partial_format(tone_name=\"Shakespeare\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "470446e4-aeb9-40cb-9017-fcdd03af8d4c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'tone_name': 'Shakespeare'}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "partial_prompt_tmpl.kwargs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b9627977-5d2a-4300-a9da-91a5dfb671a3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Context information is below.\n",
      "---------------------\n",
      "In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters\n",
      "---------------------\n",
      "Given the context information and not prior knowledge, answer the query.\n",
      "Please write the answer in the style of Shakespeare\n",
      "Query: How many params does llama 2 have\n",
      "Answer: \n"
     ]
    }
   ],
   "source": [
    "fmt_prompt = partial_prompt_tmpl.format(\n",
    "    context_str=\"In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters\",\n",
    "    query_str=\"How many params does llama 2 have\",\n",
    ")\n",
    "print(fmt_prompt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47497fe5",
   "metadata": {},
   "source": [
    "We can also use `format_messages` to format the prompt into `ChatMessage` objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "32452689",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Context information is below.'), TextBlock(block_type='text', text='---------------------'), TextBlock(block_type='text', text='In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters'), TextBlock(block_type='text', text='---------------------'), TextBlock(block_type='text', text='Given the context information and not prior knowledge, answer the query.'), TextBlock(block_type='text', text='Please write the answer in the style of Shakespeare'), TextBlock(block_type='text', text='Query: How many params does llama 2 have'), TextBlock(block_type='text', text='Answer:')])]\n"
     ]
    }
   ],
   "source": [
    "fmt_prompt = partial_prompt_tmpl.format_messages(\n",
    "    context_str=\"In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters\",\n",
    "    query_str=\"How many params does llama 2 have\",\n",
    ")\n",
    "print(fmt_prompt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1cf5e3d8-a0f8-40fd-ba32-b72f01036c24",
   "metadata": {},
   "source": [
    "## 2. Prompt Template Variable Mappings\n",
    "\n",
    "Template variable mappings let you map the \"expected\" prompt keys (e.g. `context_str` and `query_str` for response synthesis) to the keys actually used in your template.\n",
    "\n",
    "This lets you reuse your existing string templates without having to rename their variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e197319a-37cc-4a3f-a623-8fffb9c3d932",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.prompts import RichPromptTemplate\n",
    "\n",
    "# NOTE: this template uses `my_context` and `my_query` as its variables\n",
    "qa_prompt_tmpl_str = \"\"\"\\\n",
    "Context information is below.\n",
    "---------------------\n",
    "{{ my_context }}\n",
    "---------------------\n",
    "Given the context information and not prior knowledge, answer the query.\n",
    "Query: {{ my_query }}\n",
    "Answer: \\\n",
    "\"\"\"\n",
    "\n",
    "template_var_mappings = {\"context_str\": \"my_context\", \"query_str\": \"my_query\"}\n",
    "\n",
    "prompt_tmpl = RichPromptTemplate(\n",
    "    qa_prompt_tmpl_str, template_var_mappings=template_var_mappings\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "44936c0f-bae1-4955-b59f-4bcfb373bdc8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Context information is below.\n",
      "---------------------\n",
      "In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters\n",
      "---------------------\n",
      "Given the context information and not prior knowledge, answer the query.\n",
      "Query: How many params does llama 2 have\n",
      "Answer: \n"
     ]
    }
   ],
   "source": [
    "fmt_prompt = prompt_tmpl.format(\n",
    "    context_str=\"In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters\",\n",
    "    query_str=\"How many params does llama 2 have\",\n",
    ")\n",
    "print(fmt_prompt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f21e05d-145a-4d0a-b36e-10e9685af32c",
   "metadata": {},
   "source": [
    "## 3. Prompt Function Mappings\n",
    "\n",
    "You can also pass in functions as template variables instead of fixed values. Each function receives the keyword arguments passed to `format` and returns the value to substitute for its variable.\n",
    "\n",
    "This lets you dynamically inject values that depend on other values at query time.\n",
    "\n",
    "Here is a basic example. We show more advanced examples (e.g. few-shot examples) in our Prompt Engineering for RAG guide."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dffd1302-ec1c-411d-b0ef-23fd40ea4ba9",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.prompts import RichPromptTemplate\n",
    "\n",
    "qa_prompt_tmpl_str = \"\"\"\\\n",
    "Context information is below.\n",
    "---------------------\n",
    "{{ context_str }}\n",
    "---------------------\n",
    "Given the context information and not prior knowledge, answer the query.\n",
    "Query: {{ query_str }}\n",
    "Answer: \\\n",
    "\"\"\"\n",
    "\n",
    "\n",
    "def format_context_fn(**kwargs):\n",
    "    # format context with bullet points\n",
    "    context_list = kwargs[\"context_str\"].split(\"\\n\\n\")\n",
    "    fmtted_context = \"\\n\\n\".join([f\"- {c}\" for c in context_list])\n",
    "    return fmtted_context\n",
    "\n",
    "\n",
    "prompt_tmpl = RichPromptTemplate(\n",
    "    qa_prompt_tmpl_str, function_mappings={\"context_str\": format_context_fn}\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e078289-f0bc-4848-9e97-7bf7eb4abbc1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Context information is below.\n",
      "---------------------\n",
      "- In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.\n",
      "\n",
      "- Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.\n",
      "\n",
      "- Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.\n",
      "\n",
      "---------------------\n",
      "Given the context information and not prior knowledge, answer the query.\n",
      "Query: How many params does llama 2 have\n",
      "Answer: \n"
     ]
    }
   ],
   "source": [
    "context_str = \"\"\"\\\n",
    "In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.\n",
    "\n",
    "Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.\n",
    "\n",
    "Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.\n",
    "\"\"\"\n",
    "\n",
    "fmt_prompt = prompt_tmpl.format(\n",
    "    context_str=context_str, query_str=\"How many params does llama 2 have\"\n",
    ")\n",
    "print(fmt_prompt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa7bd7f6",
   "metadata": {},
   "source": [
    "## 4. Dynamic Few-Shot Examples\n",
    "\n",
    "Using function mappings, you can also dynamically inject few-shot examples based on other prompt variables.\n",
    "\n",
    "Here's an example that uses a vector store to dynamically inject few-shot text-to-SQL examples based on the query.\n",
    "\n",
    "First, let's define a text-to-SQL prompt template."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "acf8a609",
   "metadata": {},
   "outputs": [],
   "source": [
    "text_to_sql_prompt_tmpl_str = \"\"\"\\\n",
    "You are a SQL expert. You are given a natural language query, and your job is to convert it into a SQL query.\n",
    "\n",
    "Here are some examples of how you should convert natural language to SQL:\n",
    "<examples>\n",
    "{{ examples }}\n",
    "</examples>\n",
    "\n",
    "Now it's your turn.\n",
    "\n",
    "Query: {{ query_str }}\n",
    "SQL: \n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c287eabe",
   "metadata": {},
   "source": [
    "Given this prompt template, let's define and index some few-shot text-to-SQL examples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "50289b48",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5884fa74",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Settings, VectorStoreIndex\n",
    "from llama_index.core.schema import TextNode\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "\n",
    "# Set global default LLM and embed model\n",
    "Settings.llm = OpenAI(model=\"gpt-4o-mini\")\n",
    "Settings.embed_model = OpenAIEmbedding(model=\"text-embedding-3-small\")\n",
    "\n",
    "# Setup few-shot examples\n",
    "example_nodes = [\n",
    "    TextNode(\n",
    "        text=\"Query: How many params does llama 2 have?\\nSQL: SELECT COUNT(*) FROM llama_2_params;\"\n",
    "    ),\n",
    "    TextNode(\n",
    "        text=\"Query: How many layers does llama 2 have?\\nSQL: SELECT COUNT(*) FROM llama_2_layers;\"\n",
    "    ),\n",
    "]\n",
    "\n",
    "# Create index\n",
    "index = VectorStoreIndex(nodes=example_nodes)\n",
    "\n",
    "# Create retriever\n",
    "retriever = index.as_retriever(similarity_top_k=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ccf76f6e",
   "metadata": {},
   "source": [
    "With our retriever, we can create our prompt template with function mappings to dynamically inject few-shot examples based on the query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0afc1395",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.prompts import RichPromptTemplate\n",
    "\n",
    "\n",
    "def get_examples_fn(**kwargs):\n",
    "    # Retrieve the stored examples most similar to the incoming query\n",
    "    query = kwargs[\"query_str\"]\n",
    "    examples = retriever.retrieve(query)\n",
    "    # Join the retrieved examples into a single block for the prompt\n",
    "    return \"\\n\\n\".join(node.text for node in examples)\n",
    "\n",
    "\n",
    "prompt_tmpl = RichPromptTemplate(\n",
    "    text_to_sql_prompt_tmpl_str,\n",
    "    function_mappings={\"examples\": get_examples_fn},\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6830c1c4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You are a SQL expert. You are given a natural language query, and your job is to convert it into a SQL query.\n",
      "\n",
      "Here are some examples of how you should convert natural language to SQL:\n",
      "<examples>\n",
      "Query: How many params does llama 2 have?\n",
      "SQL: SELECT COUNT(*) FROM llama_2_params;\n",
      "</examples>\n",
      "\n",
      "Now it's your turn.\n",
      "\n",
      "Query: What are the number of parameters in the llama 2 model?\n",
      "SQL: \n"
     ]
    }
   ],
   "source": [
    "prompt = prompt_tmpl.format(\n",
    "    query_str=\"What are the number of parameters in the llama 2 model?\"\n",
    ")\n",
    "print(prompt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a80524b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SELECT COUNT(*) FROM llama_2_params;\n"
     ]
    }
   ],
   "source": [
    "response = Settings.llm.complete(prompt)\n",
    "print(response.text)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
